Sinusoidal matching pursuit

Extended basis functions for sinusoidal matching pursuit algorithms

Bert den Brinker

Sinusoidal audio coding schemes assume the existence of so-called sinusoidal tracks. The
track information is extracted from the original signal, stored or transmitted, and used for
synthesis. The extraction of the track information is done by segmenting the signal and
employing matching pursuit algorithms where the dictionary consists of truly sinusoidal
patterns, i.e., locally (per segment) the tracks are modelled as constant-amplitude,
constant-frequency sinusoidal patterns. It is proposed to extend these basis functions in
matching pursuit algorithms to make a more powerful and efficient dictionary.

Things known so far 

In a sinusoidal audio coding scheme, an audio signal is represented mainly by sinusoidal
components. Typically, these components are extracted from the signal on a regular basis
(constant update rate). For efficient coding, the frequencies, amplitudes and/or phases
of sinusoidal components (which are evolving in time) in consecutive intervals are linked
together such that differential coding of the frequencies and amplitudes can be applied.
This is usually done by assuming that the tracks can be modelled as constant-amplitude,
constant-frequency sinusoidal patterns in sufficiently small intervals [1, 2, 3, 4, 5, 6, 7].
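The per-segment extraction with a dictionary of constant-amplitude, constant-frequency sinusoids can be sketched as follows. This is only a minimal illustration, not the scheme of any of the cited coders: the frequency grid, the normalised time axis, the toy signal and the function names `fit_sinusoid` and `matching_pursuit` are all assumptions made for the example.

```python
import math

def fit_sinusoid(signal, times, omega):
    """Least-squares fit of p*cos(omega t) + q*sin(omega t) to one segment.

    Returns (p, q, residual)."""
    cos_w = [math.cos(omega * t) for t in times]
    sin_w = [math.sin(omega * t) for t in times]
    # 2x2 normal equations for the two real amplitudes p and q.
    cc = sum(c * c for c in cos_w)
    ss = sum(s * s for s in sin_w)
    cs = sum(c * s for c, s in zip(cos_w, sin_w))
    yc = sum(y * c for y, c in zip(signal, cos_w))
    ys = sum(y * s for y, s in zip(signal, sin_w))
    det = cc * ss - cs * cs
    p = (yc * ss - ys * cs) / det
    q = (ys * cc - yc * cs) / det
    residual = [y - p * c - q * s for y, c, s in zip(signal, cos_w, sin_w)]
    return p, q, residual

def energy(x):
    return sum(v * v for v in x)

def matching_pursuit(signal, times, omegas, n_components):
    """Greedy pursuit: repeatedly remove the grid sinusoid that leaves the least energy."""
    residual = list(signal)
    picked = []
    for _ in range(n_components):
        candidates = [(w,) + fit_sinusoid(residual, times, w) for w in omegas]
        w, p, q, residual = min(candidates, key=lambda cand: energy(cand[3]))
        picked.append((w, math.hypot(p, q)))  # (frequency, amplitude)
    return picked, residual

# Toy segment (assumed): normalised time in (-1, 1), two stationary partials.
N = 256
times = [-1.0 + (2 * n + 1) / N for n in range(N)]
signal = [math.cos(40.0 * t + 0.5) + 0.4 * math.cos(65.0 * t - 1.0) for t in times]
omegas = [5.0 * k for k in range(1, 21)]  # hypothetical frequency grid, rad per unit time
picked, residual = matching_pursuit(signal, times, omegas, 2)
```

For this stationary toy signal the two partials are found one after the other and little energy is left in the residual; the sections below concern what happens when the stationarity assumption does not hold.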

This approach has recently been questioned. It was conjectured that not all relevant
information could be extracted by the dictionary of sinusoidal patterns, and therefore it was
proposed to use damped sinusoids [8, 9]. We note that, so far, it has not been conclusively
established how relevant this extension is in practical situations and, furthermore, the
newly introduced degrees of freedom (damping parameters) are contained in a nonlinear
way in the error. This means that costly search procedures are required for the extraction
of this information.

The problem for which this invention brings the solution

In matching pursuit algorithms, one usually sees that constant-amplitude, constant-frequency
sinusoidal patterns do not reduce the spectral peaks sufficiently: apparently the assumption
of sinusoidal patterns per segment is violated. There are two possible consequences.

The first one is that the matching pursuit algorithm will model a single spectral peak
by several distinct frequencies. In most cases, this is not necessary since the linking process
(if applied properly) will induce a broadening of the spectral peaks due to amplitude or
frequency glides. So in as far as a single peak is modelled by more than one frequency
and the spurious peaks are not discarded by the psycho-acoustic model, they will more
probably be a burden than a better description.

The second possible consequence is that, if a peak is modelled as a single sinusoid only,
the side-peaks induced by subtracting the sinusoidal pattern may constitute a problem.
The cleaned signal is usually sent to another processing stage (e.g., residual or noise coder).
Again, we get the problem that the linking introduces a broadening of the peak, and
therefore the total effect is that parts of a spectral peak are modelled twice.

Since the model of constant-amplitude, constant-frequency localised patterns is apparently
not sufficient, we propose to extend the basis functions such that these patterns cover
all possible minor deviations from the assumption of stationarity. The first-order deviations
can be modelled as linear amplitude and/or frequency glides. Furthermore, we want the
extended basis to include patterns providing fair approximations to waveforms resulting
from two sinusoidal tracks where the individual track information is lost due to the finite
frequency resolution introduced by the segmentation. Lastly, we require that the extended
basis does not bring about a large increase in computational complexity. The simplest way
to do this is to consider degrees of freedom which appear linearly in the considered error
signal.



Embodiment 

In order to prevent the estimation of spurious peaks around an already estimated frequency,
it is suggested to clean the spectrum in the neighbourhood of the estimated frequency. The
simplest way of doing so is by estimating a polynomial expansion around the estimated
harmonic, i.e. fitting not only

$$s_0(t) = A_0 e^{i\omega t} \qquad (1)$$

but also

$$s_k(t) = A_k t^k e^{i\omega t} \qquad (2)$$

with $k = 1, \ldots, K$. In this way, not only does the Fourier transform of the approximation
equal that of the original at a specific frequency, but the first $K$ derivatives at this
frequency are equalised as well.

For real-valued signals this results in 2K patterns

$$s_k(t) \text{ and } s_k^*(t) \qquad (3)$$

where $^*$ denotes conjugation and where we optimise over $\omega$ and $A_k$ ($k = 0, \ldots, K$). The
parameters $A_k$ are all linearly involved in a quadratic optimisation (this in contrast to the
proposal in [8, 9]) and the Grammian can be calculated analytically for each $\omega$.

Truncation of the expansion (i.e. the value of K) can be done independently for each
spectral peak and can be made dependent on the convergence of the polynomial series.
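Because the amplitudes enter linearly, fitting the extended patterns at a given frequency reduces to solving a small set of normal equations. The sketch below illustrates this for a real-valued segment. It is an assumption-laden example rather than the disclosed implementation: the Gram matrix is accumulated numerically instead of analytically, the frequency is taken as already known, time is normalised to (-1, 1), and the names `solve` and `fit_extended` are hypothetical.

```python
import cmath
import math

def solve(M, y):
    """Solve the square system M x = y by Gaussian elimination with partial pivoting."""
    n = len(y)
    aug = [row[:] + [y[i]] for i, row in enumerate(M)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[piv] = aug[piv], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        acc = aug[r][n] - sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = acc / aug[r][r]
    return x

def fit_extended(signal, times, omega, K):
    """Least-squares fit of Re{sum_k A_k t^k e^{i omega t}}, k = 0..K.

    Returns (list of complex A_k, residual samples)."""
    # Real basis: t^k cos(omega t) carries Re A_k, -t^k sin(omega t) carries Im A_k.
    basis = []
    for k in range(K + 1):
        basis.append([t ** k * math.cos(omega * t) for t in times])
        basis.append([-(t ** k) * math.sin(omega * t) for t in times])
    gram = [[sum(p * q for p, q in zip(bi, bj)) for bj in basis] for bi in basis]
    rhs = [sum(p * s for p, s in zip(bi, signal)) for bi in basis]
    coef = solve(gram, rhs)
    approx = [sum(c * b[n] for c, b in zip(coef, basis)) for n in range(len(times))]
    residual = [s - a for s, a in zip(signal, approx)]
    amps = [complex(coef[2 * k], coef[2 * k + 1]) for k in range(K + 1)]
    return amps, residual

# Toy segment (assumed): a sinusoid with a slowly varying amplitude.
N = 256
times = [-1.0 + (2 * n + 1) / N for n in range(N)]
omega = 40.0
signal = [(1.0 + 0.3 * t + 0.2 * t * t) * math.cos(omega * t + 0.5) for t in times]

energies = []
for K in range(3):
    amps, residual = fit_extended(signal, times, omega, K)
    energies.append(sum(r * r for r in residual))
phases = [cmath.phase(a) for a in amps]  # phases of A_0, A_1, A_2 for K = 2
```

The residual energy in `energies` drops as K grows, which is one way the truncation of the expansion could be made dependent on convergence: stop increasing K for a peak once the additional reduction becomes negligible.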

The information of the higher order patterns can be used in the linking process. This
information can be used to make more refined predictions of amplitude, frequency and
phase of the considered sinusoid in the next segment or, alternatively, can be used to
steer the linking process.

Whether or not the extended data needs to be transmitted/stored has to be considered.
This can again be done on the basis of the data itself. After having established the parameters
and the track links, a psycho-acoustic model can determine the additional impact of the
extra components on the sound quality. It is conjectured that in most cases this information
need not be stored/transmitted but, rather, that it is more important to use the extended
basis to prevent modelling of artifacts due to the improper modelling of already extracted
sinusoidal components.

Application areas

Sinusoidal audio coding

Appendix

In order to avoid spending bits/analysis effort on signal components caused by slight
variations with respect to sinusoidal basis functions (e.g., small amplitude variations, small
frequency sweeps), it may be worthwhile to extract not only pure sinusoids, but to remove
such spurious activity as well. In that case, we will not spend any effort on estimating
these side effects as sinusoidal components, nor will they become part of the input to the
noise coder.

Regression with products of harmonics and polynomials 

Consider the signal $s(t)$ with $s \in \mathbb{R}$ and $t \in \mathbb{R}$. We want to provide an approximation $\hat{s}$ of
this signal $s$ on the interval $(-T/2, T/2)$. We consider the case of modelling according to

$$\hat{s}(t) = \Re\{[a + bt + ct^2] e^{i\Omega t}\} \qquad (4)$$

where $\Omega \in \mathbb{R}$ and $a, b, c \in \mathbb{C}$.

Amplitude variations

It is obvious that second-order amplitude variations can be handled. Suppose the true
signal $s$ is given by $s(t) = K(t) \cos\{\omega_0 t + \varphi\}$ with $\varphi \in \mathbb{R}$. Around $t = 0$ we have the Taylor
series expansion

$$s(t) \approx (A + Bt + Ct^2) \cos\{\omega_0 t + \varphi\} \qquad (5)$$

with $A = K(0)$, $B = K'(0)$ and $C = K''(0)/2$. Consequently, the model approximation
can be taken as $\Omega = \omega_0$, $a = Ae^{i\varphi}$, $b = Be^{i\varphi}$, $c = Ce^{i\varphi}$. In order that the Taylor series
expansion constitutes an accurate approximation, it is required that $K$ is approximately
band limited with cut-off frequency $2\pi/T$. We note that all complex amplitudes $a$, $b$, $c$ are
in phase.

Frequency variations

Consider (small) linear frequency sweeps:

$$s(t) = A \cos\{\omega_0 t + \Delta\omega\, t^2 + \varphi\} \qquad (6)$$

then we have as an approximation

$$s(t) \approx \Re\{[a + ct^2] e^{i\Omega t}\} = \hat{s}(t) \qquad (7)$$

with $\Omega = \omega_0$, $a = Ae^{i\varphi}$ and $c = A\,\Delta\omega\, e^{i\varphi + i\pi/2}$. For the approximation to hold it is required
that

$$\Delta\omega\,(T/2)^2 \ll 1. \qquad (8)$$

We note that $a$ and $c$ are out of phase.
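The effect of the extra quadratic term for a small sweep can be checked numerically. The sketch below assumes a sweep of the form A cos(omega0 t + d_omega t^2 + phi) and the coefficients a = A e^{i phi}, c = i A d_omega e^{i phi}, so that a and c are a quarter cycle out of phase. The parameter values and function names are illustrative assumptions, and the plain pattern used for comparison is simply the stationary sinusoid with c = 0 rather than a least-squares optimum.

```python
import cmath
import math

def sweep(t, amp, omega0, d_omega, phi):
    """Assumed sweep model: amp * cos(omega0*t + d_omega*t^2 + phi)."""
    return amp * math.cos(omega0 * t + d_omega * t * t + phi)

def extended_approx(t, amp, omega0, d_omega, phi):
    """Re{[a + c t^2] e^{i omega0 t}} with a = amp e^{i phi}, c = i amp d_omega e^{i phi}."""
    a = amp * cmath.exp(1j * phi)
    c = 1j * amp * d_omega * cmath.exp(1j * phi)
    return ((a + c * t * t) * cmath.exp(1j * omega0 * t)).real

def max_errors(d_omega, half_len=1.0, n=400, amp=1.0, omega0=40.0, phi=0.3):
    """Worst-case error on (-T/2, T/2) of the plain and of the extended pattern."""
    times = [-half_len + (2 * k + 1) * half_len / n for k in range(n)]
    plain = max(abs(sweep(t, amp, omega0, d_omega, phi)
                    - sweep(t, amp, omega0, 0.0, phi)) for t in times)
    extended = max(abs(sweep(t, amp, omega0, d_omega, phi)
                       - extended_approx(t, amp, omega0, d_omega, phi)) for t in times)
    return plain, extended

# d_omega * (T/2)^2 = 0.1, i.e. the sweep is small over the segment.
plain_err, ext_err = max_errors(0.1)
# A ten times smaller sweep should be approximated far more closely still.
_, ext_err_small = max_errors(0.01)
```

Since |e^{ix} - 1 - ix| is at most x^2/2 for real x, the error of the extended pattern is bounded by A (d_omega (T/2)^2)^2 / 2, which is why the approximation relies on the sweep being small over the segment.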
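Non-resolvable frequencies

Consider the signal

$$s(t) = A_1 \cos\{\omega_1 t + \varphi_1\} + A_2 \cos\{\omega_2 t + \varphi_2\} \qquad (9)$$

with $|\omega_1 - \omega_2| < 2\pi/T$ and $\varphi_1, \varphi_2 \in \mathbb{R}$. Without loss of generality, we consider $\omega_1 < \omega_2$.
Furthermore, we define $\omega_0$ with $\omega_1 < \omega_0 < \omega_2$ and

$$\Delta_j = \omega_j - \omega_0, \qquad B_j = A_j e^{i\varphi_j}, \qquad j = 1, 2.$$

We note that $\Delta_1$ and $\Delta_2$ have opposite signs. We then have

$$\begin{aligned}
s(t) &= \Re\{B_1 e^{i\omega_0 t}(1 + i\Delta_1 t - (\Delta_1 t)^2/2 + \cdots) + B_2 e^{i\omega_0 t}(1 + i\Delta_2 t - (\Delta_2 t)^2/2 + \cdots)\} \\
&\approx \Re\{[(B_1 + B_2) + i(B_1\Delta_1 + B_2\Delta_2)t - (B_1\Delta_1^2 + B_2\Delta_2^2)t^2/2] e^{i\omega_0 t}\} \\
&= \Re\{[a + bt + ct^2] e^{i\Omega t}\} = \hat{s}(t) \qquad (10)
\end{aligned}$$

with $\Omega = \omega_0$, $a = B_1 + B_2$, $b = i(B_1\Delta_1 + B_2\Delta_2)$ and $c = -(B_1\Delta_1^2 + B_2\Delta_2^2)/2$. Note that
the choice of $\omega_0$ determines the approximation coefficients $b$ and $c$.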
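As a numerical check of this approximation, the sketch below builds two closely spaced sinusoids, forms a single second-order pattern at a centre frequency w0 from the sum of the complex amplitudes and their first and second moments in the frequency offsets, and compares the two on the segment. The parameter values, the choice of w0 and the function names are assumptions for illustration only.

```python
import cmath
import math

def two_tone(t, A1, w1, phi1, A2, w2, phi2):
    """Sum of two real sinusoids with closely spaced frequencies."""
    return A1 * math.cos(w1 * t + phi1) + A2 * math.cos(w2 * t + phi2)

def merged_coefficients(A1, w1, phi1, A2, w2, phi2, w0):
    """Coefficients a, b, c of the single pattern Re{[a + b t + c t^2] e^{i w0 t}}."""
    B1 = A1 * cmath.exp(1j * phi1)
    B2 = A2 * cmath.exp(1j * phi2)
    d1, d2 = w1 - w0, w2 - w0  # frequency offsets, opposite signs when w1 < w0 < w2
    a = B1 + B2
    b = 1j * (B1 * d1 + B2 * d2)
    c = -(B1 * d1 ** 2 + B2 * d2 ** 2) / 2
    return a, b, c

def merged_pattern(t, a, b, c, w0):
    return ((a + b * t + c * t * t) * cmath.exp(1j * w0 * t)).real

# Assumed example: T = 2, so 2*pi/T is about 3.14 and the 0.5 rad spacing is not resolvable.
half_len = 1.0
params = (1.0, 39.8, 0.2, 0.7, 40.3, -0.9)  # A1, w1, phi1, A2, w2, phi2
w0 = 40.0
a, b, c = merged_coefficients(*params, w0)

n = 400
times = [-half_len + (2 * k + 1) * half_len / n for k in range(n)]
max_err = max(abs(two_tone(t, *params) - merged_pattern(t, a, b, c, w0)) for t in times)

# Third-order Taylor remainder: |e^{ix} - (1 + ix - x^2/2)| <= |x|^3 / 6.
A1, w1, _, A2, w2, _ = params
bound = (A1 * abs(w1 - w0) ** 3 + A2 * abs(w2 - w0) ** 3) * half_len ** 3 / 6
```

Moving `w0` within the interval between the two frequencies changes `b` and `c` as well as the remainder bound, so the centre frequency is a free design choice in such a pattern.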
Fig. 1



Fig. 1 shows an embodiment of the invention. An audio and/or speech signal A1 is furnished
to a parametric encoder and coded into an encoded signal A2. The
encoded signal A2 is transmitted over a communication channel or stored on a storage
medium. A parametric decoder obtains the encoded signal from the communication channel
or storage medium and decodes this signal A2 into a decoded audio and/or speech signal A1'
which is a representation of A1. The parametric encoder according to this embodiment of the
invention extracts track information from A1 by employing a matching pursuit where the
dictionary comprises extended basis functions as described above. Information on the
relevant extended basis functions may be included in the bit-stream A2 and transmitted to the
decoder. In the decoder, on the basis of the information present in the bit-stream A2, a
reconstruction of the original audio signal is made: A1'. In this reconstruction in the decoder,
the information in A2 on the extended basis functions may be used.
CLAIMS



1. A parametric coding method of encoding an audio (and/or speech) signal,
which method comprises the step of extracting track information from the audio signal by
employing a matching pursuit algorithm wherein the dictionary comprises extended basis
functions.

2. A parametric encoder for encoding an audio (and/or speech) signal, which
device comprises means for extracting track information from the audio signal by employing
a matching pursuit algorithm wherein the dictionary comprises extended basis functions.

3. A parametric decoding method of decoding an encoded audio (and/or speech)
signal, which method comprises the step of receiving the encoded audio signal which
includes information on relevant extended basis functions, and using the information on the
relevant extended basis functions in the reconstruction of an audio signal.

4. A parametric decoder for decoding an encoded audio and/or speech signal,
which decoder comprises means for receiving the encoded audio signal which includes
information on relevant extended basis functions, and means for using the information on the
relevant extended basis functions in the reconstruction of an audio signal.

5. An encoded audio and/or speech signal, which signal includes information on
relevant extended basis functions.